Overview

Dataset statistics

Number of variables: 20
Number of observations: 4250
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.7 MiB
Average record size in memory: 426.5 B

Variable types

NUM: 15
BOOL: 3
CAT: 2

Reproduction

Analysis started: 2021-11-04 18:40:12.041815
Analysis finished: 2021-11-04 18:40:56.725805
Duration: 44.68 seconds
Version: pandas-profiling v2.7.1
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

state has a high cardinality: 51 distinct values (High cardinality)
total_day_charge is highly correlated with total_day_minutes (High correlation)
total_day_minutes is highly correlated with total_day_charge (High correlation)
total_eve_charge is highly correlated with total_eve_minutes (High correlation)
total_eve_minutes is highly correlated with total_eve_charge (High correlation)
total_night_charge is highly correlated with total_night_minutes (High correlation)
total_night_minutes is highly correlated with total_night_charge (High correlation)
total_intl_charge is highly correlated with total_intl_minutes (High correlation)
total_intl_minutes is highly correlated with total_intl_charge (High correlation)
number_vmail_messages has 3139 (73.9%) zeros (Zeros)
number_customer_service_calls has 886 (20.8%) zeros (Zeros)

Variables

state
Categorical

HIGH CARDINALITY
Distinct count: 51
Unique (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 33.3 KiB

WV 139
MN 108
ID 106
AL 101
VA 100
Other values (46) 3696

Value Count Frequency (%)
WV 139 3.3%
MN 108 2.5%
ID 106 2.5%
AL 101 2.4%
VA 100 2.4%
OR 99 2.3%
TX 98 2.3%
UT 97 2.3%
NY 96 2.3%
NJ 96 2.3%
Other values (41) 3210 75.5%

Length

Max length: 2
Mean length: 2
Min length: 2

Value Count Frequency (%)
Uppercase_Letter 24 100.0%

Value Count Frequency (%)
Latin 24 100.0%

Value Count Frequency (%)
ASCII 24 100.0%
 

account_length
Real number (ℝ≥0)

Distinct count: 215
Unique (%): 5.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 100.23623529411765
Minimum: 1
Maximum: 243
Zeros: 0
Zeros (%): 0.0%
Memory size: 33.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 35.45
Q1: 73
Median: 100
Q3: 127
95-th percentile: 167
Maximum: 243
Range: 242
Interquartile range (IQR): 54

Descriptive statistics

Standard deviation: 39.69840057
Coefficient of variation (CV): 0.3960483996
Kurtosis: -0.1321747749
Mean: 100.2362353
Median Absolute Deviation (MAD): 27
Skewness: 0.1223273244
Sum: 426004
Variance: 1575.963008

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
90 53 1.2%
87 51 1.2%
93 50 1.2%
100 48 1.1%
120 48 1.1%
105 48 1.1%
116 47 1.1%
98 47 1.1%
127 47 1.1%
112 46 1.1%
Other values (205) 3765 88.6%

Value Count Frequency (%)
1 7 0.2%
2 2 < 0.1%
3 7 0.2%
4 2 < 0.1%
5 2 < 0.1%

Value Count Frequency (%)
243 1 < 0.1%
232 2 < 0.1%
225 2 < 0.1%
224 2 < 0.1%
222 2 < 0.1%
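
Several of the account_length figures above are derived from one another. The following is a minimal sketch, with the inputs copied by hand from the statistics above, of how the coefficient of variation, variance, interquartile range and range can be re-derived as a sanity check. The variable names are illustrative and are not part of the report.

```python
# Inputs copied from the account_length statistics in this report.
mean = 100.2362353
std = 39.69840057
q1, q3 = 73, 127
minimum, maximum = 1, 243

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

# Variance is the square of the standard deviation.
variance = std ** 2

# Interquartile range and range are simple differences.
iqr = q3 - q1
value_range = maximum - minimum

print(f"CV={cv:.10f} variance={variance:.6f} IQR={iqr} range={value_range}")
```

The recomputed values should agree with the reported CV, Variance, IQR and Range up to rounding of the printed inputs.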
 

area_code
Categorical

Distinct count: 3
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 33.3 KiB

area_code_415 2108
area_code_408 1086
area_code_510 1056

Value Count Frequency (%)
area_code_415 2108 49.6%
area_code_408 1086 25.6%
area_code_510 1056 24.8%

Length

Max length: 13
Mean length: 13
Min length: 13

Value Count Frequency (%)
Lowercase_Letter 6 50.0%
Decimal_Number 5 41.7%
Connector_Punctuation 1 8.3%

Value Count Frequency (%)
Latin 6 50.0%
Common 6 50.0%

Value Count Frequency (%)
ASCII 12 100.0%
 
international_plan
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 33.3 KiB

no 3854
yes 396

Value Count Frequency (%)
no 3854 90.7%
yes 396 9.3%
 

voice_mail_plan
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 33.3 KiB

no 3138
yes 1112

Value Count Frequency (%)
no 3138 73.8%
yes 1112 26.2%
 

number_vmail_messages
Real number (ℝ≥0)

ZEROS
Distinct count: 46
Unique (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.631764705882353
Minimum: 0
Maximum: 52
Zeros: 3139
Zeros (%): 73.9%
Memory size: 33.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 16
95-th percentile: 36
Maximum: 52
Range: 52
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 13.4398822
Coefficient of variation (CV): 1.761045147
Kurtosis: 0.2730383375
Mean: 7.631764706
Median Absolute Deviation (MAD): 0
Skewness: 1.373091038
Sum: 32435
Variance: 180.6304335

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
0 3139 73.9%
31 69 1.6%
28 58 1.4%
29 57 1.3%
24 57 1.3%
33 55 1.3%
27 54 1.3%
26 53 1.2%
32 47 1.1%
30 47 1.1%
Other values (36) 614 14.4%

Value Count Frequency (%)
0 3139 73.9%
4 1 < 0.1%
6 2 < 0.1%
8 2 < 0.1%
10 4 0.1%

Value Count Frequency (%)
52 1 < 0.1%
50 2 < 0.1%
49 3 0.1%
48 4 0.1%
47 4 0.1%
 

total_day_minutes
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 1843
Unique (%): 43.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 180.2596
Minimum: 0.0
Maximum: 351.5
Zeros: 2
Zeros (%): < 0.1%
Memory size: 33.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 91.59
Q1: 143.325
Median: 180.45
Q3: 216.2
95-th percentile: 271.055
Maximum: 351.5
Range: 351.5
Interquartile range (IQR): 72.875

Descriptive statistics

Standard deviation: 54.01237333
Coefficient of variation (CV): 0.2996365982
Kurtosis: -0.05670971637
Mean: 180.2596
Median Absolute Deviation (MAD): 36.6
Skewness: -0.006910229801
Sum: 766103.3
Variance: 2917.336473

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
189.3 10 0.2%
180 9 0.2%
184.5 8 0.2%
154 8 0.2%
177.1 8 0.2%
168.6 7 0.2%
230.7 7 0.2%
183.6 7 0.2%
197 7 0.2%
185 7 0.2%
Other values (1833) 4172 98.2%

Value Count Frequency (%)
0 2 < 0.1%
2.6 1 < 0.1%
6.6 1 < 0.1%
7.2 1 < 0.1%
7.8 1 < 0.1%

Value Count Frequency (%)
351.5 1 < 0.1%
346.8 1 < 0.1%
345.3 1 < 0.1%
338.4 1 < 0.1%
337.4 1 < 0.1%
 

total_day_calls
Real number (ℝ≥0)

Distinct count: 120
Unique (%): 2.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 99.90729411764706
Minimum: 0
Maximum: 165
Zeros: 2
Zeros (%): < 0.1%
Memory size: 33.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 67
Q1: 87
Median: 100
Q3: 113
95-th percentile: 133
Maximum: 165
Range: 165
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 19.85081731
Coefficient of variation (CV): 0.1986923726
Kurtosis: 0.1935936484
Mean: 99.90729412
Median Absolute Deviation (MAD): 13
Skewness: -0.08581246337
Sum: 424606
Variance: 394.054948

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
105 101 2.4%
95 97 2.3%
110 92 2.2%
94 92 2.2%
112 90 2.1%
102 89 2.1%
97 88 2.1%
107 87 2.0%
100 85 2.0%
108 84 2.0%
Other values (110) 3345 78.7%

Value Count Frequency (%)
0 2 < 0.1%
30 1 < 0.1%
34 1 < 0.1%
35 1 < 0.1%
36 1 < 0.1%

Value Count Frequency (%)
165 1 < 0.1%
160 2 < 0.1%
158 2 < 0.1%
157 2 < 0.1%
156 3 0.1%
 

total_day_charge
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 1843
Unique (%): 43.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.644682352941174
Minimum: 0.0
Maximum: 59.76
Zeros: 2
Zeros (%): < 0.1%
Memory size: 33.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 15.5735
Q1: 24.365
Median: 30.68
Q3: 36.75
95-th percentile: 46.081
Maximum: 59.76
Range: 59.76
Interquartile range (IQR): 12.385

Descriptive statistics

Standard deviation: 9.182096033
Coefficient of variation (CV): 0.2996309744
Kurtosis: -0.0565844345
Mean: 30.64468235
Median Absolute Deviation (MAD): 6.225
Skewness: -0.006912526228
Sum: 130239.9
Variance: 84.31088755

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
32.18 10 0.2%
30.6 9 0.2%
30.11 8 0.2%
31.37 8 0.2%
26.18 8 0.2%
31.45 7 0.2%
34.58 7 0.2%
29.58 7 0.2%
28.63 7 0.2%
28.66 7 0.2%
Other values (1833) 4172 98.2%

Value Count Frequency (%)
0 2 < 0.1%
0.44 1 < 0.1%
1.12 1 < 0.1%
1.22 1 < 0.1%
1.33 1 < 0.1%

Value Count Frequency (%)
59.76 1 < 0.1%
58.96 1 < 0.1%
58.7 1 < 0.1%
57.53 1 < 0.1%
57.36 1 < 0.1%
 

total_eve_minutes
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 1773
Unique (%): 41.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 200.17390588235293
Minimum: 0.0
Maximum: 359.3
Zeros: 1
Zeros (%): < 0.1%
Memory size: 33.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 118.2
Q1: 165.925
Median: 200.7
Q3: 233.775
95-th percentile: 282.71
Maximum: 359.3
Range: 359.3
Interquartile range (IQR): 67.85

Descriptive statistics

Standard deviation: 50.24951818
Coefficient of variation (CV): 0.2510293135
Kurtosis: 0.04345320215
Mean: 200.1739059
Median Absolute Deviation (MAD): 33.7
Skewness: -0.03041458624
Sum: 850739.1
Variance: 2525.014078

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
230.9 10 0.2%
187.5 9 0.2%
194 9 0.2%
169.9 9 0.2%
199.7 9 0.2%
201 8 0.2%
216.5 8 0.2%
223.5 8 0.2%
209.4 8 0.2%
211.5 8 0.2%
Other values (1763) 4164 98.0%

Value Count Frequency (%)
0 1 < 0.1%
22.3 1 < 0.1%
37.8 1 < 0.1%
41.7 1 < 0.1%
42.2 1 < 0.1%

Value Count Frequency (%)
359.3 1 < 0.1%
352.1 1 < 0.1%
351.6 1 < 0.1%
349.4 1 < 0.1%
348.5 1 < 0.1%
 

total_eve_calls
Real number (ℝ≥0)

Distinct count: 123
Unique (%): 2.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 100.17647058823529
Minimum: 0
Maximum: 170
Zeros: 1
Zeros (%): < 0.1%
Memory size: 33.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 67
Q1: 87
Median: 100
Q3: 114
95-th percentile: 133
Maximum: 170
Range: 170
Interquartile range (IQR): 27

Descriptive statistics

Standard deviation: 19.9085911
Coefficient of variation (CV): 0.1987352019
Kurtosis: 0.1145997215
Mean: 100.1764706
Median Absolute Deviation (MAD): 13
Skewness: -0.02081182363
Sum: 425750
Variance: 396.3519998

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
105 98 2.3%
103 96 2.3%
91 95 2.2%
97 91 2.1%
94 88 2.1%
96 88 2.1%
108 88 2.1%
88 87 2.0%
101 86 2.0%
104 85 2.0%
Other values (113) 3348 78.8%

Value Count Frequency (%)
0 1 < 0.1%
12 1 < 0.1%
36 1 < 0.1%
38 1 < 0.1%
43 1 < 0.1%

Value Count Frequency (%)
170 1 < 0.1%
169 1 < 0.1%
168 1 < 0.1%
159 1 < 0.1%
157 1 < 0.1%
 

total_eve_charge
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 1572
Unique (%): 37.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.015011764705886
Minimum: 0.0
Maximum: 30.54
Zeros: 1
Zeros (%): < 0.1%
Memory size: 33.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 10.05
Q1: 14.1025
Median: 17.06
Q3: 19.8675
95-th percentile: 24.031
Maximum: 30.54
Range: 30.54
Interquartile range (IQR): 5.765

Descriptive statistics

Standard deviation: 4.271211992
Coefficient of variation (CV): 0.2510260969
Kurtosis: 0.04332949445
Mean: 17.01501176
Median Absolute Deviation (MAD): 2.86
Skewness: -0.03038789084
Sum: 72313.8
Variance: 18.24325188

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
16.12 13 0.3%
18.79 13 0.3%
14.25 13 0.3%
16.97 12 0.3%
15.9 12 0.3%
18.96 11 0.3%
16.8 10 0.2%
19.63 10 0.2%
17.09 10 0.2%
16.41 9 0.2%
Other values (1562) 4137 97.3%

Value Count Frequency (%)
0 1 < 0.1%
1.9 1 < 0.1%
3.21 1 < 0.1%
3.54 1 < 0.1%
3.59 1 < 0.1%

Value Count Frequency (%)
30.54 1 < 0.1%
29.93 1 < 0.1%
29.89 1 < 0.1%
29.7 1 < 0.1%
29.62 1 < 0.1%
 

total_night_minutes
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 1757
Unique (%): 41.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 200.52788235294116
Minimum: 0.0
Maximum: 395.0
Zeros: 1
Zeros (%): < 0.1%
Memory size: 33.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 118.09
Q1: 167.225
Median: 200.45
Q3: 234.7
95-th percentile: 282.71
Maximum: 395
Range: 395
Interquartile range (IQR): 67.475

Descriptive statistics

Standard deviation: 50.35354807
Coefficient of variation (CV): 0.251104971
Kurtosis: 0.1148535776
Mean: 200.5278824
Median Absolute Deviation (MAD): 33.55
Skewness: 0.008490819348
Sum: 852243.5
Variance: 2535.479804

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
186.2 11 0.3%
208.9 10 0.2%
188.2 8 0.2%
169.4 8 0.2%
193.6 8 0.2%
230.1 8 0.2%
190.5 8 0.2%
228.1 8 0.2%
214.7 8 0.2%
214 8 0.2%
Other values (1747) 4165 98.0%

Value Count Frequency (%)
0 1 < 0.1%
23.2 1 < 0.1%
43.7 1 < 0.1%
45 1 < 0.1%
46.7 1 < 0.1%

Value Count Frequency (%)
395 1 < 0.1%
381.9 1 < 0.1%
381.6 1 < 0.1%
377.5 1 < 0.1%
367.7 1 < 0.1%
 

total_night_calls
Real number (ℝ≥0)

Distinct count: 128
Unique (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 99.8395294117647
Minimum: 0
Maximum: 175
Zeros: 1
Zeros (%): < 0.1%
Memory size: 33.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 67
Q1: 86
Median: 100
Q3: 113
95-th percentile: 132
Maximum: 175
Range: 175
Interquartile range (IQR): 27

Descriptive statistics

Standard deviation: 20.09321979
Coefficient of variation (CV): 0.2012551532
Kurtosis: 0.07721835856
Mean: 99.83952941
Median Absolute Deviation (MAD): 14
Skewness: 0.005273110227
Sum: 424318
Variance: 403.7374815

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
105 100 2.4%
99 92 2.2%
95 91 2.1%
102 90 2.1%
91 88 2.1%
94 88 2.1%
104 87 2.0%
98 87 2.0%
100 86 2.0%
109 85 2.0%
Other values (118) 3356 79.0%

Value Count Frequency (%)
0 1 < 0.1%
33 1 < 0.1%
36 1 < 0.1%
38 2 < 0.1%
40 1 < 0.1%

Value Count Frequency (%)
175 1 < 0.1%
170 1 < 0.1%
165 1 < 0.1%
164 1 < 0.1%
161 1 < 0.1%
 

total_night_charge
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 992
Unique (%): 23.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.023891764705883
Minimum: 0.0
Maximum: 17.77
Zeros: 1
Zeros (%): < 0.1%
Memory size: 33.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5.3145
Q1: 7.5225
Median: 9.02
Q3: 10.56
95-th percentile: 12.7255
Maximum: 17.77
Range: 17.77
Interquartile range (IQR): 3.0375

Descriptive statistics

Standard deviation: 2.265921811
Coefficient of variation (CV): 0.2511025033
Kurtosis: 0.1148651735
Mean: 9.023891765
Median Absolute Deviation (MAD): 1.51
Skewness: 0.008444754041
Sum: 38351.54
Variance: 5.134401655

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
9.4 18 0.4%
9.63 17 0.4%
8.15 17 0.4%
10.8 17 0.4%
9.66 16 0.4%
8.82 15 0.4%
10.49 15 0.4%
9.76 15 0.4%
8.57 14 0.3%
10.35 14 0.3%
Other values (982) 4092 96.3%

Value Count Frequency (%)
0 1 < 0.1%
1.04 1 < 0.1%
1.97 1 < 0.1%
2.03 1 < 0.1%
2.1 1 < 0.1%

Value Count Frequency (%)
17.77 1 < 0.1%
17.19 1 < 0.1%
17.17 1 < 0.1%
16.99 1 < 0.1%
16.55 1 < 0.1%
 

total_intl_minutes
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 168
Unique (%): 4.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.256070588235294
Minimum: 0.0
Maximum: 20.0
Zeros: 22
Zeros (%): 0.5%
Memory size: 33.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5.7
Q1: 8.5
Median: 10.3
Q3: 12
95-th percentile: 14.6
Maximum: 20
Range: 20
Interquartile range (IQR): 3.5

Descriptive statistics

Standard deviation: 2.760101726
Coefficient of variation (CV): 0.2691188309
Kurtosis: 0.7029511928
Mean: 10.25607059
Median Absolute Deviation (MAD): 1.8
Skewness: -0.2413595394
Sum: 43588.3
Variance: 7.618161539

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
11.1 75 1.8%
9.8 73 1.7%
11.4 73 1.7%
10.2 72 1.7%
10.9 71 1.7%
11.3 70 1.6%
10.1 69 1.6%
9.7 68 1.6%
9.5 66 1.6%
10.5 66 1.6%
Other values (158) 3547 83.5%

Value Count Frequency (%)
0 22 0.5%
0.4 1 < 0.1%
1.1 2 < 0.1%
1.3 1 < 0.1%
2 2 < 0.1%

Value Count Frequency (%)
20 1 < 0.1%
19.7 2 < 0.1%
19.3 1 < 0.1%
19.2 1 < 0.1%
18.9 1 < 0.1%
 

total_intl_calls
Real number (ℝ≥0)

Distinct count: 21
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.426352941176471
Minimum: 0
Maximum: 20
Zeros: 22
Zeros (%): 0.5%
Memory size: 33.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
Median: 4
Q3: 6
95-th percentile: 9
Maximum: 20
Range: 20
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.463069113
Coefficient of variation (CV): 0.5564556522
Kurtosis: 3.263227525
Mean: 4.426352941
Median Absolute Deviation (MAD): 1
Skewness: 1.360122209
Sum: 18812
Variance: 6.066709454

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
3 847 19.9%
4 795 18.7%
2 644 15.2%
5 598 14.1%
6 408 9.6%
7 272 6.4%
1 226 5.3%
8 153 3.6%
9 126 3.0%
10 59 1.4%
Other values (11) 122 2.9%

Value Count Frequency (%)
0 22 0.5%
1 226 5.3%
2 644 15.2%
3 847 19.9%
4 795 18.7%

Value Count Frequency (%)
20 1 < 0.1%
19 1 < 0.1%
18 4 0.1%
17 1 < 0.1%
16 7 0.2%
 

total_intl_charge
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 168
Unique (%): 4.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.7696541176470584
Minimum: 0.0
Maximum: 5.4
Zeros: 22
Zeros (%): 0.5%
Memory size: 33.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1.54
Q1: 2.3
Median: 2.78
Q3: 3.24
95-th percentile: 3.94
Maximum: 5.4
Range: 5.4
Interquartile range (IQR): 0.94

Descriptive statistics

Standard deviation: 0.7452041364
Coefficient of variation (CV): 0.2690603609
Kurtosis: 0.7033212689
Mean: 2.769654118
Median Absolute Deviation (MAD): 0.48
Skewness: -0.2416706661
Sum: 11771.03
Variance: 0.5553292049

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
3 75 1.8%
3.08 73 1.7%
2.65 73 1.7%
2.75 72 1.7%
2.94 71 1.7%
3.05 70 1.6%
2.73 69 1.6%
2.62 68 1.6%
2.84 66 1.6%
2.57 66 1.6%
Other values (158) 3547 83.5%

Value Count Frequency (%)
0 22 0.5%
0.11 1 < 0.1%
0.3 2 < 0.1%
0.35 1 < 0.1%
0.54 2 < 0.1%

Value Count Frequency (%)
5.4 1 < 0.1%
5.32 2 < 0.1%
5.21 1 < 0.1%
5.18 1 < 0.1%
5.1 1 < 0.1%
 

number_customer_service_calls
Real number (ℝ≥0)

ZEROS
Distinct count: 10
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.5590588235294118
Minimum: 0
Maximum: 9
Zeros: 886
Zeros (%): 20.8%
Memory size: 33.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 1
Q3: 2
95-th percentile: 4
Maximum: 9
Range: 9
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.31143353
Coefficient of variation (CV): 0.8411700126
Kurtosis: 1.655618759
Mean: 1.559058824
Median Absolute Deviation (MAD): 1
Skewness: 1.082691586
Sum: 6626
Variance: 1.719857904

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
1 1524 35.9%
2 947 22.3%
0 886 20.8%
3 558 13.1%
4 209 4.9%
5 81 1.9%
6 28 0.7%
7 13 0.3%
9 2 < 0.1%
8 2 < 0.1%

Value Count Frequency (%)
0 886 20.8%
1 1524 35.9%
2 947 22.3%
3 558 13.1%
4 209 4.9%

Value Count Frequency (%)
9 2 < 0.1%
8 2 < 0.1%
7 13 0.3%
6 28 0.7%
5 81 1.9%
 

churn
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 33.3 KiB

no 3652
yes 598

Value Count Frequency (%)
no 3652 85.9%
yes 598 14.1%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
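
As a minimal sketch of that calculation, the function below computes r in plain Python. The function name `pearson_r` is illustrative and is not part of pandas-profiling; the input values are total_day_minutes and total_day_charge copied from the first five sample rows of this report. The 1/(n-1) factors of the covariance and the standard deviations cancel, so they are omitted.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centred products; the common 1/(n-1) factor cancels in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# total_day_minutes and total_day_charge from the first five sample rows.
minutes = [161.6, 243.4, 299.4, 166.7, 218.2]
charge = [27.47, 41.38, 50.90, 28.34, 37.09]

r = pearson_r(minutes, charge)

# Shifting and rescaling one variable (with a positive factor) leaves r unchanged.
r_rescaled = pearson_r([2 * m + 10 for m in minutes], charge)

print(r, r_rescaled)
```

On these five rows r is very close to 1, which is consistent with the high-correlation warnings raised for the minutes and charge columns.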

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
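
A minimal sketch of that procedure follows: replace each value by its rank (ties receive the average of the ranks they span), then apply the Pearson formula to the ranks. The helper names and the toy data are illustrative assumptions, not taken from the report or from pandas-profiling.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Toy data: y is a nonlinear but strictly increasing function of x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

For this toy input ρ reaches 1 because the relationship is perfectly monotonic, while Pearson's r stays below 1 because the relationship is not linear.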

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
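
The pair-counting definition can be sketched directly in plain Python. This is the simple form without a tie correction (often called tau-a), where tied pairs count as neither concordant nor discordant. Statistical libraries frequently apply a tie-corrected variant instead, so results on data with ties may differ. The function name and toy data are illustrative assumptions.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, without tie correction."""
    concordant = 0
    discordant = 0
    pairs = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        pairs += 1
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    return (concordant - discordant) / pairs

# Toy data: six pairs in total, exactly one of which is discordant.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

tau = kendall_tau_a(x, y)
print(tau)
```

With five concordant pairs and one discordant pair out of six, the sketch yields τ = 4/6.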

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
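
The following is a minimal sketch of the bias-corrected estimator as it is commonly stated, computed from a contingency table of counts. It is not taken from the pandas-profiling source. The function name and the two example tables are hypothetical, and the sketch assumes every row and column has at least one observation.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r = len(table)        # number of rows
    k = len(table[0])     # number of columns
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Hypothetical 2x2 tables: perfect association and exact independence.
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
v_independent = cramers_v_corrected([[25, 25], [25, 25]])
print(v_perfect, v_independent)
```

The two hypothetical tables exercise the ends of the range: the perfectly associated table gives a value of 1 (up to floating-point error) and the independent table gives 0.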

Missing values

Sample

First rows

index state account_length area_code international_plan voice_mail_plan number_vmail_messages total_day_minutes total_day_calls total_day_charge total_eve_minutes total_eve_calls total_eve_charge total_night_minutes total_night_calls total_night_charge total_intl_minutes total_intl_calls total_intl_charge number_customer_service_calls churn
0 OH 107 area_code_415 no yes 26 161.6 123 27.47 195.5 103 16.62 254.4 103 11.45 13.7 3 3.70 1 no
1 NJ 137 area_code_415 no no 0 243.4 114 41.38 121.2 110 10.30 162.6 104 7.32 12.2 5 3.29 0 no
2 OH 84 area_code_408 yes no 0 299.4 71 50.90 61.9 88 5.26 196.9 89 8.86 6.6 7 1.78 2 no
3 OK 75 area_code_415 yes no 0 166.7 113 28.34 148.3 122 12.61 186.9 121 8.41 10.1 3 2.73 3 no
4 MA 121 area_code_510 no yes 24 218.2 88 37.09 348.5 108 29.62 212.6 118 9.57 7.5 7 2.03 3 no
5 MO 147 area_code_415 yes no 0 157.0 79 26.69 103.1 94 8.76 211.8 96 9.53 7.1 6 1.92 0 no
6 LA 117 area_code_408 no no 0 184.5 97 31.37 351.6 80 29.89 215.8 90 9.71 8.7 4 2.35 1 no
7 WV 141 area_code_415 yes yes 37 258.6 84 43.96 222.0 111 18.87 326.4 97 14.69 11.2 5 3.02 0 no
8 IN 65 area_code_415 no no 0 129.1 137 21.95 228.5 83 19.42 208.8 111 9.40 12.7 6 3.43 4 yes
9 RI 74 area_code_415 no no 0 187.7 127 31.91 163.4 148 13.89 196.0 94 8.82 9.1 5 2.46 0 no
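
The first rows above hint at why every charge column is flagged as highly correlated with its minutes column: within each period, charge divided by minutes is nearly constant. The sketch below checks this on the first three sample rows. The per-minute rates it prints are inferred from those rows only and are not stated anywhere in the report. The helper name `implied_rates` is illustrative.

```python
# (minutes, charge) pairs copied from sample rows 0, 1 and 2 above.
day = [(161.6, 27.47), (243.4, 41.38), (299.4, 50.90)]
eve = [(195.5, 16.62), (121.2, 10.30), (61.9, 5.26)]
night = [(254.4, 11.45), (162.6, 7.32), (196.9, 8.86)]
intl = [(13.7, 3.70), (12.2, 3.29), (6.6, 1.78)]

def implied_rates(pairs):
    """Charge per minute for each row, rounded to three decimals."""
    return [round(charge / minutes, 3) for minutes, charge in pairs]

for name, pairs in [("day", day), ("eve", eve), ("night", night), ("intl", intl)]:
    print(name, implied_rates(pairs))
```

If the ratio is constant across the whole dataset, each charge column is a rescaled copy of its minutes column, so one column of each pair carries no additional information for modelling.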

Last rows

index state account_length area_code international_plan voice_mail_plan number_vmail_messages total_day_minutes total_day_calls total_day_charge total_eve_minutes total_eve_calls total_eve_charge total_night_minutes total_night_calls total_night_charge total_intl_minutes total_intl_calls total_intl_charge number_customer_service_calls churn
4240 AR 127 area_code_415 no yes 27 157.6 107 26.79 280.6 49 23.85 75.1 77 3.38 8.0 4 2.16 1 no
4241 WA 80 area_code_510 no no 0 157.0 101 26.69 208.8 127 17.75 113.3 109 5.10 16.2 2 4.37 2 no
4242 MN 150 area_code_408 no no 0 170.0 115 28.90 162.7 138 13.83 267.2 77 12.02 8.3 2 2.24 0 no
4243 ND 140 area_code_510 no no 0 244.7 115 41.60 258.6 101 21.98 231.3 112 10.41 7.5 6 2.03 1 yes
4244 AZ 97 area_code_510 no no 0 252.6 89 42.94 340.3 91 28.93 256.5 67 11.54 8.8 5 2.38 1 yes
4245 MT 83 area_code_415 no no 0 188.3 70 32.01 243.8 88 20.72 213.7 79 9.62 10.3 6 2.78 0 no
4246 WV 73 area_code_408 no no 0 177.9 89 30.24 131.2 82 11.15 186.2 89 8.38 11.5 6 3.11 3 no
4247 NC 75 area_code_408 no no 0 170.7 101 29.02 193.1 126 16.41 129.1 104 5.81 6.9 7 1.86 1 no
4248 HI 50 area_code_408 no yes 40 235.7 127 40.07 223.0 126 18.96 297.5 116 13.39 9.9 5 2.67 2 no
4249 VT 86 area_code_415 no yes 34 129.4 102 22.00 267.1 104 22.70 154.8 100 6.97 9.3 16 2.51 0 no